Identifying Top k Dominating Objects over Uncertain Data
Abstract
Uncertainty is inherent in many important applications, such as data integration, environmental surveillance, location-based services (LBS), sensor monitoring, and radio-frequency identification (RFID). In recent years, significant research effort has been devoted to building probabilistic database management systems, and many important queries have been re-investigated in the context of uncertain data models. In this paper, we study the problem of top k dominating queries on multi-dimensional uncertain objects, an essential tool in multi-criteria decision analysis when an explicit scoring function is not available. In particular, we formally introduce a top k dominating model based on state-of-the-art top k semantics over uncertain data. We also propose effective and efficient algorithms to identify the top k dominating objects. Novel pruning techniques that exploit spatial indexing and statistical information significantly improve the performance of the algorithms in terms of CPU and I/O costs. Comprehensive experiments on real and synthetic datasets demonstrate the effectiveness and efficiency of our techniques.
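To make the query concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that each uncertain object is a list of (point, probability) instances, that objects are independent, that smaller attribute values are better, and that objects are ranked by expected domination score. That ranking criterion is chosen here purely for illustration; the abstract does not spell out which top k semantics the paper adopts. All function names are hypothetical, and the brute-force loops use none of the pruning or indexing techniques the abstract mentions.

```python
from itertools import product


def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def dominance_probability(u, v):
    """Probability that uncertain object u dominates uncertain object v.
    Each object is a list of (point, probability) instances; the two
    objects are assumed to be independent."""
    return sum(pu * pv for (x, pu), (y, pv) in product(u, v) if dominates(x, y))


def expected_scores(objects):
    """Expected number of other objects that each object dominates."""
    return {
        name: sum(
            dominance_probability(instances, other_instances)
            for other, other_instances in objects.items()
            if other != name
        )
        for name, instances in objects.items()
    }


def top_k_dominating(objects, k):
    """Names of the k objects with the highest expected domination score
    (illustrative criterion; ties are broken by name)."""
    scores = expected_scores(objects)
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


if __name__ == "__main__":
    # Hypothetical data: A has two possible locations, B and C are certain.
    objects = {
        "A": [((1, 1), 0.6), ((3, 3), 0.4)],
        "B": [((2, 2), 1.0)],
        "C": [((4, 4), 1.0)],
    }
    print(top_k_dominating(objects, 2))
```

This version compares every pair of instances, so its cost grows quadratically with the total number of instances. That cost is the kind that spatial indexing and pruning, as described in the abstract, are meant to reduce.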
Similar resources
Top-k Dominating Queries: a Survey
Top-k dominating queries combine the advantages of top-k queries and skyline queries, and eliminate their disadvantages. They return k objects with the highest domination score, which is defined as the number of dominated objects. As a top-k query, the user can bound the number of returned results through the parameter k, and like a skyline query a user-selected scoring function is not required...
PhD Thesis Efficiently and Effectively Processing Probabilistic Queries on Uncertain Data Candidate
Uncertainty is inherent in many real applications. Uncertain data analysis and query processing have become a critical issue and have recently attracted a great deal of attention in the database research community. The thesis, therefore, targets an important and challenging topic: uncertain data management. It is a high quality and well-written PhD thesis. Five important and related aspects of uncerta...
Metric-Based Top-k Dominating Queries
Top-k dominating queries combine the natural idea of selecting the k best items with a comprehensive “goodness” criterion based on dominance. A point p1 dominates p2 if p1 is as good as p2 in all attributes and is strictly better in at least one. Existing works address the problem in settings where data objects are multidimensional points. However, there are domains where we only have access to...
Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data
The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ...
Top-k Queries Over Uncertain Scores
Modern recommendation systems leverage some form of collaborative user or crowd-sourced collection of information. For instance, services like TripAdvisor, Airbnb and HungryGoWhere rely on user-generated content to describe and classify hotels, vacation rentals and restaurants. By nature of such independent collection of information, the multiplicity, diversity and varying quality of the inform...
Publication date: 2014